SuffixMiner: Efficiently Mining Frequent Itemsets in Data Streams by Suffix-Forest

Authors

  • Lifeng Jia
  • Chunguang Zhou
  • Zhe Wang
  • Xiujuan Xu
Abstract

We propose a new algorithm, SuffixMiner, which eliminates the need for multiple passes over the data when finding all frequent itemsets in data streams, takes full advantage of the special property of the suffix-tree to avoid generating candidate itemsets and traversing each suffix-tree during itemset growth, and uses a new itemset growth method to mine all frequent itemsets in data streams. Experimental results show that SuffixMiner not only scales well when mining frequent itemsets over data streams, but also outperforms the Apriori and FP-Growth algorithms.
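To make the problem statement concrete, the following is a minimal sketch of single-pass frequent itemset counting. It is not the SuffixMiner algorithm and uses no suffix-forest: it is a naive baseline that enumerates every subset of each transaction, so it is only practical for short transactions. The function name `frequent_itemsets`, the `min_support` parameter (a fraction of the transactions seen), and the sample data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(stream, min_support):
    """Naive one-pass baseline (NOT SuffixMiner): count every non-empty
    subset of each transaction, then keep those meeting min_support."""
    counts = Counter()
    n = 0  # number of transactions seen so far
    for transaction in stream:
        n += 1
        items = sorted(set(transaction))  # canonical order, no duplicates
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                counts[subset] += 1
    threshold = min_support * n
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


# Hypothetical sample stream; each transaction is read exactly once.
transactions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
result = frequent_itemsets(iter(transactions), min_support=0.75)
```

The cost of this baseline grows exponentially with transaction length, which is the kind of overhead that candidate-free methods such as the one described in the abstract aim to avoid.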

Similar resources

CLAIM: An Efficient Method for Relaxed Frequent Closed Itemsets Mining over Stream Data

Recently, frequent itemset mining over data streams has attracted much attention. However, mining closed itemsets from data streams has not been well addressed. The main difficulty lies in the high maintenance complexity caused by the exact model definition of closed itemsets and the dynamic changes of data streams. In the data stream scenario, it is sufficient to mine only approximated frequent...

Efficient mining of temporal emerging itemsets from data streams

In this paper, we propose a new method, namely EFI-Mine, for mining temporal emerging frequent itemsets from data streams efficiently and effectively. The temporal emerging frequent itemsets are those that are infrequent in the current time window of data stream but have high potential to become frequent in the subsequent time windows. Discovery of emerging frequent itemsets is an important pro...

Top-k-FCI: Mining Top-K Frequent Closed Itemsets in Data Streams

With the generation and analysis of stream data, such as real-time network monitoring, log records, and click streams, data stream mining has received a great deal of attention in the field of data mining. In data stream mining, it is more reasonable to ask users to set a bound on the result size. Therefore, in this paper, a real-time single-pass algorithm, called ...

Concept Shift Detection for Frequent Itemsets from Sliding Windows over Data Streams

In a mobile business collaboration environment, frequent itemset analysis discovers noticeable associated events and data, providing important information about user behavior. Many algorithms have been proposed for mining frequent itemsets over data streams. However, in many practical situations where the data arrival rate is very high, continuously mining the data sets within a sliding wi...

A Simple but Effective Maximal Frequent Itemset Mining Algorithm over Streams

Maximal frequent itemsets are one of several condensed representations of frequent itemsets, which store most of the information contained in frequent itemsets using less space, thus being more suitable for stream mining. This paper considers a simple but effective algorithm for mining maximal frequent itemsets over a stream landmark. We design a compact data structure named FP-FOREST to improv...

Publication date: 2005